We report an experimental realization of an adaptive quantum state tomography protocol. Our
method takes advantage of a Bayesian approach to statistical inference and is naturally tailored for
adaptive strategies. For pure states we observe close to $1/N$ scaling of infidelity with the overall
number of registered events, while the best non-adaptive protocols allow for $1/\sqrt{N}$ scaling only.
Experiments are performed for polarization qubits, but the approach is readily adapted to any dimension.

PACS numbers: 



The main goal of quantum state tomography is to provide an estimate $\hat\rho$ for an unknown
quantum state $\rho$ based on the data collected in a series of measurements [1]. The
estimator is supposed to be close to the real state in some reasonable sense, therefore
various notions of statistical distance between quantum states are used 0, Q . One of
the possible measures of statistical distance is infidelity H, defined as
$1 - F(\hat\rho,\rho) = 1 - \left[\mathrm{Tr}\sqrt{\sqrt{\rho}\,\hat\rho\,\sqrt{\rho}}\right]^2$.
The ultimate goal of any tomographic protocol is to minimize this distance for a fixed overall
number of measurements $N$. Usually a protocol makes use of some fixed number of measurement
settings predetermined before the actual experiment. For such a protocol the infidelity
scales as $1 - F \sim N^{-1/2}$ for almost-pure states, which are the most interesting set
for applications. One can alter the prefactor, more or less significantly, by a clever choice
of measurements p-lll], but the scaling is unaffected. A natural question is whether it is
possible to beat this limit. The answer turns out to be positive if one allows for
adaptivity: the measurement performed at some step of the protocol is chosen depending on
the data obtained in the previous ones [1, Q .

Here we report an experimental approach to adaptive quantum state tomography based on a
recently proposed adaptive Bayesian estimation algorithm ■. We achieve almost $1/N$ scaling
of infidelity for pure states of polarization qubits and demonstrate a clear advantage over
the best symmetric non-adaptive protocols. Our approach is completely different from, and
more general than, that of another recent experimental realization [11], where adaptive
measurements were used to estimate a single unknown parameter of a quantum state.

Bayesian tomography. Let us start by describing a general framework for quantum state
estimation and, in particular, the Bayesian approach. A tomographic protocol is a set of
positive operator valued measures (POVMs) $\mathcal{M} = \{M_\alpha\}$, with the index $\alpha$
numbering the different configurations of the experimental apparatus. In a given
configuration, the probability of observing outcome $\gamma$ is given by Born's rule:

$$\mathbb{P}(\gamma|\rho,\alpha) = \mathrm{Tr}[M_{\alpha\gamma}\rho], \tag{1}$$

where $M_{\alpha\gamma}$ are the POVM elements, obeying $\sum_\gamma M_{\alpha\gamma} = I$, and
$\rho$ is the density matrix of the state to be determined. The set $\mathcal{D}$ of all
outcomes observed in an experiment forms the data set used to estimate the density matrix
elements. The Bayesian approach to statistical inference dictates the following rules:

• a prior distribution $p(\rho)$ over the space of density matrices is specified;

• the collected data are used to obtain the posterior distribution
$p(\rho|\mathcal{D}) \propto \mathcal{L}(\rho;\mathcal{D})\,p(\rho)$, where
$\mathcal{L}(\rho;\mathcal{D})$ is the likelihood function, which contains our statistical
model encoding the probabilistic mapping from the state to the observed data;

• quantities of interest are estimated using expected values under the posterior
distribution: for example, we may obtain the Bayesian mean estimate of the state as
$\hat\rho = \mathbb{E}_{p(\rho|\mathcal{D})}[\rho]$. Variance, infidelity or any other
statistical quantity of interest may be obtained in the same way.
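
The three rules above can be sketched numerically. The following is a minimal illustration and not the implementation used in this work: it assumes a finite set of candidate qubit states given by Stokes vectors, a flat prior over them, and a noiseless projective measurement; the function names (`density_matrix`, `born_probabilities`, `bayes_update`) are hypothetical.

```python
import numpy as np

# Pauli matrices used to build qubit density matrices and projectors.
PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def density_matrix(stokes):
    """Density matrix rho = (I + s . sigma) / 2 for a Stokes vector s, |s| <= 1."""
    stokes = np.asarray(stokes, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", stokes, PAULI))


def born_probabilities(rho, axis):
    """Born's rule for the two projectors (I +/- n . sigma) / 2 along unit axis n."""
    axis = np.asarray(axis, dtype=float)
    projectors = [density_matrix(sign * axis) for sign in (+1, -1)]
    return np.array([np.trace(m @ rho).real for m in projectors])


def bayes_update(prior, candidates, axis, outcome):
    """Posterior over the candidate states after observing one outcome (0 or 1)."""
    likelihood = np.array([born_probabilities(density_matrix(s), axis)[outcome]
                           for s in candidates])
    posterior = prior * likelihood
    return posterior / posterior.sum()


# Hypothetical candidate states: +z, -z and +x eigenstates, flat prior.
candidates = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])
prior = np.full(len(candidates), 1.0 / 3.0)
posterior = bayes_update(prior, candidates, axis=[0, 0, 1], outcome=0)
# Bayesian mean estimate: posterior-weighted average of the candidate Stokes vectors.
mean_stokes = posterior @ candidates
```

Observing the "+z" outcome rules out the "-z" candidate entirely and leaves the "+x" candidate with half the likelihood of "+z"; the mean estimate is therefore a mixed state, as expected for an average over the posterior.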

The Bayesian approach has many advantages over maximum-likelihood estimation (MLE), which
is more standard in the quantum information community. It offers, in a natural way, a
distribution over the space of density matrices, which provides the most complete
description of our knowledge about the quantum state inferred from the data $\mathcal{D}$
[12]. Even more importantly for us, it is a natural framework for the construction of
adaptive estimation protocols. Indeed, the posterior distribution may be updated as soon as
one observes some data (in the extreme case, after each measurement), and the new knowledge
about the state may be used to select the next measurement setting $\alpha$ optimally.
Choosing the criterion for "optimality" is a task of optimal experiment design (OED) and
may be solved in various ways. In the Bayesian framework a natural strategy is to choose
the measurement that maximally reduces the entropy of the posterior, which means that the
knowledge about the state gained from this measurement is maximized [10]. This may be
formulated as choosing a measurement configuration $\alpha$ that solves the following
optimization problem:

$$\alpha = \operatorname{argmax}_{\alpha} \left\{ H[p(\rho|\mathcal{D})] - \mathbb{E}_{p(\gamma|\alpha,\mathcal{D})} H[p(\rho|\gamma,\alpha,\mathcal{D})] \right\}. \tag{2}$$

Note that because we do not know which outcome $\gamma$ will be observed, we use the
expected information gain (under the posterior) as our objective. Before describing how
to work with (2) in practice we detail the components required for our Bayesian model.

The likelihood function. The likelihood function is equal to the probability of the
observed data given a particular state, i.e. $\mathcal{L}(\rho;\mathcal{D}) = \mathbb{P}(\gamma|\rho,\alpha)$.
In the simplest setting, i.e. in the absence of any experimental noise, the likelihood
function is given directly by Born's rule (1). In practice experimental noise also needs
to be modelled in the likelihood function; we present these extensions to the simple model
later in the paper.

The prior. As the Bayesian framework implies finding a probability distribution instead of
a single point estimate, the analysis should also take into account the particular geometry
of the space, i.e. the geometry of the single-qubit density matrix space. In general, the
geometry of a space is defined by its metric, which provides a notion of distance. In the
case of density matrices the natural choice of metric is the Bures distance, defined as
$d_B^2(\rho_1,\rho_2) = 2 - 2\sqrt{F(\rho_1,\rho_2)}$. Locally it coincides with the concept
of the Fubini-Study (quantum angle) measure and the Hilbert-Schmidt distance (see Ref. [3]).
As follows from its definition, for close-by states the Bures distance is the square root
of the infidelity.
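
The last statement can be checked numerically: writing $F = 1 - \epsilon$ gives $d_B^2 = 2 - 2\sqrt{1-\epsilon} \approx \epsilon$ for small $\epsilon$. The sketch below is only an illustration; it assumes the squared (Uhlmann) form of the fidelity, and the helper names (`psd_sqrt`, `pure_state`) are hypothetical.

```python
import numpy as np


def psd_sqrt(matrix):
    """Square root of a Hermitian positive semidefinite matrix via eigendecomposition."""
    values, vectors = np.linalg.eigh(matrix)
    values = np.clip(values, 0.0, None)  # remove tiny negative rounding errors
    return (vectors * np.sqrt(values)) @ vectors.conj().T


def fidelity(rho1, rho2):
    """Uhlmann fidelity F = (Tr sqrt(sqrt(rho1) rho2 sqrt(rho1)))^2."""
    root = psd_sqrt(rho1)
    return np.trace(psd_sqrt(root @ rho2 @ root)).real ** 2


def bures_distance_squared(rho1, rho2):
    """d_B^2 = 2 - 2 sqrt(F), as defined in the text."""
    return 2.0 - 2.0 * np.sqrt(fidelity(rho1, rho2))


def pure_state(theta):
    """Real pure qubit state cos(theta/2)|0> + sin(theta/2)|1> as a density matrix."""
    psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    return np.outer(psi, psi)


# Two nearby pure states: the squared Bures distance and the infidelity nearly coincide.
rho_a, rho_b = pure_state(0.0), pure_state(0.02)
infidelity = 1.0 - fidelity(rho_a, rho_b)
d2 = bures_distance_squared(rho_a, rho_b)
```

For well-separated states the two quantities differ, which is why the correspondence is stated only locally.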

It can be shown that the curvature of the Bures metric space for single-qubit states is
constant [13], thus it is isometric to a hypersphere in a 4-d space. A simple isometric
mapping exists between the hemisphere of radius 1/2 and the Stokes parameters:

Thus a uniform distribution in the Bures distance sense is a projection of a uniformly
populated 3-sphere onto the space of Stokes parameters. This projection evidently gives a
lower density at its center and a higher density near its surface.

The first step in any Bayesian estimation procedure is to choose a prior distribution.
Ideally a prior should give no information about the system, thus a typical choice for
the prior is the Haar measure on the space in question. In our case we use the Haar measure
on the Bures metric space. However, to stick with the conventional (and very convenient)
parameterization with three Stokes parameters we use the above-mentioned isometry from a
3-sphere. Samples obtained by this procedure concentrate more along the surface of the
Stokes parameters ball, which is natural in the sense that the distance and, thus, the
fidelity between the samples remains uniform over the whole space. This would not be the
case if the ball were populated uniformly, as fidelity is not directly connected with
separations in the space of Stokes parameters.
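
The sampling procedure can be sketched as follows. This is an illustration under a stated assumption: the isometry is taken to amount to drawing a point uniformly on the unit 3-sphere and using three of its four Cartesian coordinates as the Stokes vector. The function names and the comparison with a uniformly filled ball are hypothetical additions.

```python
import numpy as np


def sample_bures_prior(n_samples, rng):
    """Stokes vectors obtained by projecting uniform points on the unit 3-sphere."""
    points = rng.normal(size=(n_samples, 4))
    points /= np.linalg.norm(points, axis=1, keepdims=True)  # uniform on S^3
    return points[:, 1:]  # drop one coordinate -> point inside the unit ball


def sample_uniform_ball(n_samples, rng):
    """Stokes vectors distributed uniformly inside the unit ball, for comparison."""
    directions = rng.normal(size=(n_samples, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.uniform(size=(n_samples, 1)) ** (1.0 / 3.0)
    return directions * radii


rng = np.random.default_rng(0)
bures_radii = np.linalg.norm(sample_bures_prior(20000, rng), axis=1)
ball_radii = np.linalg.norm(sample_uniform_ball(20000, rng), axis=1)
# Fraction of samples in the outer shell |S| > 0.9 for each prior.
outer_bures = np.mean(bures_radii > 0.9)
outer_ball = np.mean(ball_radii > 0.9)
```

Under this assumption the projected samples populate the outer shell of the ball noticeably more densely than a uniform filling does, in line with the qualitative behaviour described above.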

We should also mention here another strategy extensively used in the literature for prior
generation [12]. According to it, the density matrix of a single qubit is treated as the
result of tracing out other qubits from a higher-dimensional pure state. Starting from
different dimensions one gets different prior distributions; however, with increasing
dimensionality the distribution tends to concentrate around the completely mixed state,
giving less and less weight to pure states. This unnatural behavior prevents us from
using it.
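
The concentration towards mixed states can be illustrated with a short sketch. It assumes Haar-random pure states of the qubit plus an ancilla, generated from normalized complex Gaussian amplitudes; the function name `sample_induced_stokes` and the choice of the mean squared Stokes-vector length as a summary statistic are hypothetical.

```python
import numpy as np


def sample_induced_stokes(n_samples, env_dim, rng):
    """Stokes vectors of a qubit obtained by tracing out an env_dim-level ancilla."""
    # Random pure states of qubit (x) ancilla as 2 x env_dim coefficient matrices.
    shape = (n_samples, 2, env_dim)
    amp = rng.normal(size=shape) + 1j * rng.normal(size=shape)
    amp /= np.linalg.norm(amp, axis=(1, 2), keepdims=True)
    # Reduced state rho = A A^dagger (partial trace over the ancilla index).
    rho = np.einsum("nik,njk->nij", amp, amp.conj())
    s_x = 2 * rho[:, 0, 1].real
    s_y = -2 * rho[:, 0, 1].imag
    s_z = (rho[:, 0, 0] - rho[:, 1, 1]).real
    return np.stack([s_x, s_y, s_z], axis=1)


rng = np.random.default_rng(0)
# Mean squared length of the Stokes vector: 1 for pure states, 0 for the fully mixed state.
mean_r2_two_qubit = np.mean(np.sum(sample_induced_stokes(20000, 2, rng) ** 2, axis=1))
mean_r2_qubit_qutrit = np.mean(np.sum(sample_induced_stokes(20000, 3, rng) ** 2, axis=1))
```

Increasing the ancilla dimension lowers the mean squared length, i.e. the samples move towards the center of the Stokes ball.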

To illustrate the above, in Fig. 1 we show the density of samples in a flat disc cut from
the center of the Stokes parameters ball for the Haar measure (a), and for the results of
tracing down from two-qubit (b) and qubit-qutrit (c) pure states. The first distribution
has more density towards the circumference, following the behavior of fidelity in the
Stokes parameter space. The other distributions give more and more favor to mixed states,
revealing their inadequacy for our purpose.




Figure 1: Population of samples in the space of Stokes parameters vs. $S_x$ and $S_y$,
conditioned on $|S_z| < 0.05$. Samples are derived from (a) the Haar measure in the Bures
metric space; (b) two-qubit and (c) qubit-qutrit pure states with the second particle
traced out. The Haar measure gives a flat infidelity distribution between all samples,
while the latter two favor mixed states, giving poorer fidelity to pure states.

Approximate Inference. One of the principal reasons that Bayesian methods enjoy less
popularity in quantum tomography than MLE is the fact that posterior normalization
requires computing an (in general high-dimensional) integral of the likelihood function,
which is computationally hard. Usually, when faced with intractable Bayesian inference,
the posterior is approximated via sampling [16], or by approximating the posterior with a
simpler distribution, such as a Gaussian [17].

This computational difficulty is further compounded when one performs adaptive quantum
tomography: one must keep track of the current posterior after making $n$ measurements in
order to calculate the optimal $(n+1)$-st measurement. Performing inference based on all
the observed data is at best an $O(n)$ operation, which becomes increasingly problematic
as the experiment progresses. Fortunately, fast algorithms for solving online Bayesian
inference problems exist; they update the posterior after inclusion of each new datapoint,
without revisiting all previous data. We briefly review the core idea behind this
approach, and refer the reader to [18] for details.

The algorithm is a variant of the sequential importance sampling (SIS) algorithm with
resampling. The idea is to construct a particle filter, approximating the posterior with a
set of weighted samples, i.e.
$p(\rho|\mathcal{D}_n) \approx \sum_{s=1}^{S} w_s^{(n)}\,\delta(\rho - \rho_s)$.
After each observation one updates the weights $w_s^{(n)}$; this can be done incrementally,
using the current set of particles and weights and the likelihood corresponding to the new
observation. The algorithm has $O(1)$ cost for integration of the likelihood, which means
that it can be applied on-line at every step of the adaptive protocol, irrespective of the
current amount of data collected.
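
The incremental weight update can be sketched as below. This is a simplified illustration rather than the algorithm of Ref. [18]: it assumes noiseless projective qubit measurements, particles stored as Stokes vectors, and it omits resampling; all names are hypothetical.

```python
import numpy as np


def outcome_probability(stokes, axis, outcome):
    """Noiseless Born probability (1 +/- n . s) / 2 for a projective qubit measurement."""
    sign = 1.0 if outcome == 0 else -1.0
    return 0.5 * (1.0 + sign * stokes @ axis)


def update_weights(weights, particles, axis, outcome):
    """Multiply each particle weight by the likelihood of the new outcome, then normalize."""
    new_weights = weights * outcome_probability(particles, axis, outcome)
    return new_weights / new_weights.sum()


rng = np.random.default_rng(1)
# Particles drawn by projecting uniform points on the unit 3-sphere (assumed prior).
particles = rng.normal(size=(5000, 4))
particles = (particles / np.linalg.norm(particles, axis=1, keepdims=True))[:, 1:]
weights = np.full(len(particles), 1.0 / len(particles))

true_state = np.array([0.0, 0.0, 1.0])  # hypothetical true state, used to simulate data
axis = np.array([0.0, 0.0, 1.0])
for _ in range(50):
    p0 = outcome_probability(true_state, axis, 0)
    outcome = 0 if rng.uniform() < p0 else 1
    # Each update touches only the current weights, never the earlier outcomes.
    weights = update_weights(weights, particles, axis, outcome)

posterior_mean = weights @ particles
```

The cost of one update depends on the number of particles but not on the number of outcomes already processed, which is the property the text relies on.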

A common problem with weighted particle filters is that after collection of sufficient
data, a small collection of particles gets almost all of the weight; this means that the
effective sample size reduces and the quality of the approximation to the posterior
becomes poor. This problem is avoided by monitoring the effective sample size and
resampling the particles when it falls too low. The resampling procedure uses two phases:
first the particles are redrawn from the set of current particles in proportion to their
weight (and the weights are reset to equal values), then they are 'spread back out' using
the Metropolis-Hastings algorithm. It is important to note that the Metropolis-Hastings
algorithm is conveniently implemented on the 3-sphere surface defined by (3), which
automatically ensures correct step sizes and avoids unnecessary boundary conditions on
the surface of the Bloch sphere, which, in general, may bias the posterior. Using the
particles, whenever we require an expectation under the posterior, e.g. when computing the
mean fidelity or when choosing the next measurement for adaptive tomography, we can simply
replace the complex integrals with simple weighted Monte Carlo estimates.
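
A minimal sketch of the monitoring and redrawing phase is given below. It assumes the common definition of the effective sample size, $1/\sum_s w_s^2$, and simple multinomial resampling; the threshold of one half is an arbitrary illustrative choice, the degenerate weights are synthetic, and the Metropolis-Hastings move is omitted.

```python
import numpy as np


def effective_sample_size(weights):
    """ESS = 1 / sum(w_s^2) for normalized weights; equals S for uniform weights."""
    return 1.0 / np.sum(weights ** 2)


def resample(particles, weights, rng):
    """Redraw particles in proportion to their weights and equalize the weights."""
    n = len(weights)
    indices = rng.choice(n, size=n, p=weights)
    return particles[indices], np.full(n, 1.0 / n)


rng = np.random.default_rng(2)
particles = rng.uniform(-1, 1, size=(1000, 3))
weights = rng.uniform(size=1000) ** 20  # deliberately degenerate synthetic weights
weights /= weights.sum()

ess_before = effective_sample_size(weights)
if ess_before < 0.5 * len(weights):  # illustrative threshold, not taken from the text
    particles, weights = resample(particles, weights, rng)
ess_after = effective_sample_size(weights)
```

After redrawing, many particles are exact duplicates; this is what the subsequent Metropolis-Hastings step is meant to repair.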

Efficient Adaptive Tomography. We now return to the computation of the objective function
for adaptive tomography (2). Although this objective is theoretically attractive, even
with the sampling estimate of the posterior two major computational difficulties are
encountered. Firstly, we must compute entropies of distributions over (in general)
high-dimensional quantum states; it is notoriously hard to compute entropies directly
from samples of a distribution. Secondly, one requires the posterior distribution
$p(\rho|\gamma,\alpha,\mathcal{D})$ for all possible next measurements $\alpha$ and all
their possible outcomes $\gamma$; this would require performing an SIS update for all
these possible scenarios. This computational burden is unavoidable if one uses a loss
function other than the log loss (the log loss being the one that leads to the Shannon
entropy objective function), which means that optimal designs can then only be computed
for very short experiments [20]. With the entropy objective, however, it is highly
beneficial to work with the following equivalent formulation:



$$\alpha = \operatorname{argmax}_{\alpha} \left\{ H[p(\gamma|\alpha,\mathcal{D})] - \mathbb{E}_{p(\rho|\mathcal{D})} H[p(\gamma|\alpha,\rho)] \right\}. \tag{4}$$

In (4) only predictive entropies are required, which is much easier because the output
space is typically much lower dimensional than the state space, and only the current
posterior $p(\rho|\mathcal{D})$ is needed.
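
With a weighted-particle posterior, the objective (4) reduces to a few array operations. The sketch below is an illustration under simplifying assumptions: noiseless two-outcome projective measurements, particles stored as Stokes vectors, and a finite list of candidate measurement axes instead of a continuous optimization; all names are hypothetical.

```python
import numpy as np


def entropy(probabilities):
    """Shannon entropy (in nats) along the last axis, with 0 log 0 = 0."""
    p = np.clip(probabilities, 1e-300, 1.0)
    return -np.sum(probabilities * np.log(p), axis=-1)


def expected_information_gain(particles, weights, axis):
    """Predictive entropy minus posterior-averaged outcome entropy for one setting."""
    p_plus = 0.5 * (1.0 + particles @ axis)          # Born probability per particle
    per_particle = np.stack([p_plus, 1.0 - p_plus], axis=1)
    predictive = weights @ per_particle              # p(gamma | alpha, D)
    return entropy(predictive) - weights @ entropy(per_particle)


def best_setting(particles, weights, candidate_axes):
    """Pick the candidate measurement axis with the largest expected gain."""
    gains = [expected_information_gain(particles, weights, a) for a in candidate_axes]
    return int(np.argmax(gains)), gains


# Toy posterior: the state is either the +x or the -x eigenstate, equally likely.
particles = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
weights = np.array([0.5, 0.5])
axes = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])]
best_index, gains = best_setting(particles, weights, axes)
```

In this toy case a measurement along x distinguishes the two hypotheses perfectly, while a measurement along z gives a coin flip for both and therefore carries no information; no SIS update for hypothetical outcomes is needed to see this.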

Simulations. We performed simulated experiments to empirically evaluate the performance
of Bayesian adaptive tomography. As our performance metric we use the mean infidelity
measured against the true state $\tilde\rho$:
$1 - F(\rho,\tilde\rho) = \mathbb{E}_{p(\rho|\mathcal{D}_n)}[1 - F(\rho,\tilde\rho)]$. Note that
the Bayesian mean $1 - F(\rho,\tilde\rho)$ is a 'fairer' score than the fidelity of a point
estimate, e.g. the posterior mean
(i.e. $1 - F(\mathbb{E}_{p(\rho|\mathcal{D}_n)}[\rho],\tilde\rho)$). The fidelity of the
posterior mean does not take into account the uncertainty
captured by the posterior. The posterior mean could be 
correct, for example, if the state is pure and the pos- 
terior has become 'flattened' against the surface of the 
Bloch sphere; the variance, however, could be very high 
- the system may have little knowledge of the polar and 
azimuth angles (in the Bloch sphere) of the true state. 
The Bayesian estimator rewards posterior distributions 
that are centered in the correct location and have low 
variance. 

To achieve statistically significant results we perform multiple runs within each
simulation. For each run we generate a random pure state which we use as the true state
$\tilde\rho$. We average over 20 runs, each with a different true state. All measurements
performed in a single run are shown in Fig. 2. After the system collects some initial
information about the measured state it tends to choose measurements aligned with its
current estimate of the true state (although not exclusively), thus taking advantage of
the adaptive approach. Interestingly, the first three optimal measurements chosen
correspond precisely to a set of Mutually Unbiased Bases (MUBs), but the fourth diverges
from these MUBs. We compare the results to those for measurements chosen uniformly at
random and for measurements selected randomly from a set of MUBs, which have been shown
to be the optimal (in terms of information gain) non-adaptive measurement set one can
choose prior to the experiment [21]. We also fit a power law of the form
$1 - F \propto N^{a}$ to the data in order to compare the convergence rates of the
different methods. Note that the performance of the random and adaptive schemes is
independent of the angle of the true state (the algorithms remain the same under any
rotation of the Stokes coordinates); however, that of the fixed MUBs is not. Drawing
intuition from the fact that the adaptive scheme selects mostly measurements that align
with the true state (hence 'squashing' the posterior against the Bloch sphere), we find
that the 'best case' for MUB tomography is when the state is aligned with one of the MUB
measurements, and the worst case is when the state is equally biased to all measurements,
i.e. $\{\sigma_x = \sigma_y = \sigma_z = 1/\sqrt{3}\}$.

The results are presented in Fig. 3. Firstly, note that random tomography yields a
$1/\sqrt{N}$ rate as expected, $a = -0.448 \pm 0.183$. However, adaptive tomography
performs close to the $1/N$ level on average, $a = -0.915 \pm 0.101$. In its most
favorable scenario, MUB tomography also performs close to the $1/N$ rate (with a small
multiplicative constant improvement over the adaptive scheme). In practice, because the
optimal MUBs depend on the state, these are not known a priori. In the worst case MUBs
achieve $1/\sqrt{N}$ convergence; we also observe that on average the rate is closer to
$1/\sqrt{N}$ than to $1/N$, $a = -0.593 \pm 0.134$.
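
The exponents quoted above come from power-law fits. One simple way to obtain such an exponent, shown here only as a sketch, is a linear least-squares fit in log-log coordinates; the fitting procedure actually used is not specified in the text, and the data below are synthetic, generated purely to exercise the function.

```python
import numpy as np


def fit_power_law(n_values, infidelities):
    """Least-squares fit of log(1 - F) = a log N + log c; returns (a, c)."""
    slope, intercept = np.polyfit(np.log(n_values), np.log(infidelities), deg=1)
    return slope, np.exp(intercept)


# Synthetic curve 1 - F = 2 N^(-0.9) with multiplicative noise (not experimental data).
n_values = np.logspace(2, 5, 30)
rng = np.random.default_rng(3)
synthetic = 2.0 * n_values ** -0.9 * np.exp(rng.normal(scale=0.05, size=n_values.size))
a, c = fit_power_law(n_values, synthetic)
```

A fit in log space weights relative rather than absolute deviations, which suits data spanning several orders of magnitude in $N$.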
Figure 2: Illustration of an adaptive choice of measurements during reconstruction of a
pure state. The Bloch sphere is given in three orthogonal projections with each
measurement basis marked as a dark point. As expected, measurement bases tend to
concentrate around the reconstructed state or the symmetric state on the other side of
the sphere.

Measurement blocks. One can save computational and experimental time by taking blocks of
measurements, using the same configuration for $k$ consecutive measurements. As the
posterior collapses towards the true state, the optimal measurement changes less
frequently: a direction pointing towards the true state becomes increasingly preferable.
Therefore, as the experiment progresses the block size $k$ can be allowed to grow without
degrading the quality of the experimental procedure. We use a heuristic block-size
schedule which increases the block size at an $O(n)$ rate, where $n$ is the number of
measurements seen so far. In particular we use $k = \max(\lfloor n/100 \rfloor, 1)$; in
this case the achieved infidelity scales linearly with elapsed time. Simulations show that
using this schedule does not make any statistically significant difference to the
convergence rate, or even to the absolute fidelity achieved at any time.
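
The schedule is simple enough to write down directly. The sketch below implements $k = \max(\lfloor n/100 \rfloor, 1)$ and counts how many posterior updates it implies; the function names and the total of 10 000 measurements are hypothetical choices for illustration.

```python
def block_size(n_seen):
    """Heuristic block size k = max(floor(n / 100), 1) from the text."""
    return max(n_seen // 100, 1)


def block_schedule(total_measurements):
    """List of consecutive block sizes until total_measurements have been taken."""
    blocks, n_seen = [], 0
    while n_seen < total_measurements:
        # Truncate the last block so the total is met exactly.
        k = min(block_size(n_seen), total_measurements - n_seen)
        blocks.append(k)
        n_seen += k
    return blocks


blocks = block_schedule(10_000)
n_updates = len(blocks)  # number of times the measurement setting is re-optimized
```

Because the block size grows in proportion to $n$, the number of re-optimizations grows only logarithmically once $n$ exceeds a few hundred.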

Experimental imperfections. In practice quantum tomography is inevitably subject to
experimental noise. This noise is not modeled in the likelihood function given by Born's
rule (1), and therefore its presence may bias the results of inference, reducing both the
fidelity of the inferred state and the optimality of the adaptive experimental design.
For our set-up we have identified two major additional sources of experimental noise.
Firstly, the presence of detector dark counts with detector-specific rates. Secondly,
attenuation in both channels due to detector inefficiency and losses/reflections at the
optical elements. If the attenuation were equal in both channels, then the inference
would be unaffected; however, unequal attenuation will bias the posterior.

Figure 3: Simulated tomography using three measurement selection methods: randomly
sampled (red), MUBs (blue) and fully adaptive Bayesian tomography (black). For these
methods, the true state is random and pure; the results presented here are the average of
20 independent runs. Overlaid dashed lines indicate the power law fit. Functions
$1 - F = cN^{-1/2}$ (magenta) and $1 - F = cN^{-1}$ (cyan) are shown for comparison. The
performance of random and adaptive tomography is clearly independent of the Bloch-sphere
angle of the pure state, while that of the fixed MUBs is not. Therefore, we also present
the performance of MUB tomography given the 'worst' and 'best' (see text) true states
(dark, light green respectively).

A popular approach to modeling the additional uncertainty in the state is to model the
observed state as a linear mixture of the true state and the maximally mixed state [15].
Although with this assumption one can model certain simple noise sources, such as dark
counts arriving with equal rates at each detector, we address the specific sources of
noise in our experimental paradigm more directly. To model dark counts, we assume that
the production of photons by the laser source and the arrival of dark counts at the
detectors can be modeled using independent homogeneous Poisson processes with (constant)
rate parameters $\lambda_s$ for the source and $\lambda_\gamma$ for each detector. These
rates are estimated a priori using a pilot experiment. From these assumptions one can
derive the following likelihood function using the properties of the Poisson distribution:

$$\mathbb{P}(\gamma|\rho,\alpha,\lambda_s,\{\lambda_{\gamma'}\}) = \frac{\mathrm{Tr}[M_{\alpha\gamma}\rho]\,\lambda_s + \lambda_\gamma}{\lambda_s + \sum_{\gamma'}\lambda_{\gamma'}}. \tag{5}$$



To deal with channel efficiency, we assume that there is a fixed probability of a photon
being 'lost' from a channel; this probability is denoted $1 - \eta_\gamma$ for each
channel $\gamma$, and hence $\eta_\gamma$ may be interpreted as the 'channel efficiency'.
These probabilities are also estimated in a preliminary experiment. Given these
efficiencies the likelihood becomes:

$$\mathbb{P}(\gamma|\rho,\alpha,\{\eta_{\gamma'}\}) = \frac{\mathrm{Tr}[M_{\alpha\gamma}\rho]\,\eta_\gamma}{\sum_{\gamma'}\mathrm{Tr}[M_{\alpha\gamma'}\rho]\,\eta_{\gamma'}}. \tag{6}$$



Note that in both cases the numerator and denominator contain only linear terms in the
additional parameters $(\lambda, \eta)$. Therefore, one only requires estimates of the
ratios of the dark count rates to the source rate, $\lambda_\gamma/\lambda_s$, and, for
single-qubit tomography, the ratio of the efficiencies of the two channels,
$\eta_1/\eta_2$. It is straightforward to show that this property also holds when one
collects blocks of measurements in one configuration and computes the likelihood of all
measurements in the block simultaneously.
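
Both noise models, and the fact that only ratios of the nuisance parameters matter, can be illustrated with a short sketch. It assumes a two-channel projective qubit measurement and treats the two effects separately, as in Eqs. (5) and (6); the function names and the numerical rates are hypothetical.

```python
import numpy as np


def ideal_probabilities(stokes, axis):
    """Noiseless Born probabilities for the two output channels."""
    p_plus = 0.5 * (1.0 + np.dot(stokes, axis))
    return np.array([p_plus, 1.0 - p_plus])


def with_dark_counts(ideal, source_rate, dark_rates):
    """Eq. (5): click probabilities when dark counts add to the source counts."""
    dark_rates = np.asarray(dark_rates, dtype=float)
    return (ideal * source_rate + dark_rates) / (source_rate + dark_rates.sum())


def with_efficiencies(ideal, efficiencies):
    """Eq. (6): click probabilities for unequal channel efficiencies."""
    weighted = ideal * np.asarray(efficiencies, dtype=float)
    return weighted / weighted.sum()


ideal = ideal_probabilities(np.array([0.0, 0.6, 0.8]), np.array([0.0, 0.0, 1.0]))
# Rescaling all rates (or both efficiencies) by a common factor changes nothing.
dark_a = with_dark_counts(ideal, 1000.0, [10.0, 20.0])
dark_b = with_dark_counts(ideal, 1.0, [0.010, 0.020])
eff_a = with_efficiencies(ideal, [0.6, 0.3])
eff_b = with_efficiencies(ideal, [2.0, 1.0])
```

Equal efficiencies leave the ideal probabilities untouched, which matches the remark that equal attenuation in both channels does not affect the inference.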

Experiment. A sketch of our experimental setup is shown in Fig. 4. We use a CW 850 nm
VCSEL diode laser coupled to a single-mode fiber as a light source. The radiation is
attenuated to the single-photon level by a set of neutral density filters F and
additionally spatially filtered with small iris apertures. The input polarization state
is defined by a Glan-Taylor prism GP with a high extinction ratio (more than 6000:1); the
prism transmits horizontally polarized light, which may be transformed to some arbitrary
state with a custom quartz plate WP.




Figure 4: Experimental setup. An attenuated laser is used as 
a source, the polarization state is prepared by a set of wave- 
plates, and analyzed by a sequence of a quarter- and half-wave 
plates, followed by a polarizing beam-splitter and two single- 
photon counters. Waveplates are rotated by electronically 
controlled step-motor drivers to allow for adaptivity. 



The measurement scheme consists of a zero-order quarter-wave plate QWP and a half-wave
plate HWP. The plates are rotated by step-motor-driven stages with a minimal angular step
of 0.1°. The zero position is controlled by a Hall sensor, giving an uncertainty of 0.2°
in the wave-plate zero position. We clean up the polarization states in the output
channels of a polarization beam-splitting cube (PBS) with two additional Glan-Taylor
prisms to ensure a high extinction ratio. Effectively that is equivalent to introducing
some losses in the non-ideal PBS without altering the output polarization states. In each
channel photons are coupled to multi-mode fibers and detected by single photon counting
modules (SPCMs) D1 and D2 (Perkin-Elmer). Electronic pulses from the SPCMs are sent to
home-made counters which may operate in two regimes: counting for a fixed period of time
or counting until the specified number of counts is reached.

To show the advantage of adaptive state estimation over non-adaptive protocols we
performed a direct comparison in the Bayesian framework. All non-adaptive protocols were
shown to scale similarly in the limit of large $N$ in our simulations, so we chose
randomly sampled measurements for comparison. In the adaptive estimation scheme we used
two strategies, adaptation after every single measurement and block measurements, and
found no statistically significant differences. Fig. 5 shows the dependence of the mean
infidelity $1 - F(\rho,\hat\rho) = \mathbb{E}_{p(\rho|\mathcal{D}_n)}[1 - F(\rho,\hat\rho)]$
with the current estimate $\hat\rho$ (for which we used the Bayesian posterior mean) on
the number of measurements performed. It is important to note that we intentionally did
not average over many realizations at each step of the algorithm; data points in Fig. 5
are averages over several full runs of the experiment. The convergence rate behaves
regularly from run to run. Fitting the data averaged over 6 realizations with a power law
of the form $1 - F \propto N^{a}$, we obtained $a = -0.700 \pm 0.005$ for the non-adaptive
protocol and $a = -0.889 \pm 0.003$ for the adaptive one.

Figure 5: Experimental results: mean infidelity $1 - F(\rho,\hat\rho)$ with the current
Bayesian mean estimate $\hat\rho$ for non-adaptive measurements (black line) and adaptive
measurements (red line). Data points are averages over 6 experimental runs. Solid
straight lines are best power law fits.
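
The mean infidelity with the current Bayesian mean estimate is straightforward to evaluate from a weighted-particle posterior. The sketch below is an illustration, not the analysis code used for the figures: it assumes particles stored as Stokes vectors and uses the standard closed form of the qubit fidelity, $F = \frac{1}{2}\left[1 + \mathbf{r}\cdot\mathbf{s} + \sqrt{(1-|\mathbf{r}|^2)(1-|\mathbf{s}|^2)}\right]$; the function names and the two-particle toy posterior are hypothetical.

```python
import numpy as np


def qubit_fidelity(r, s):
    """Fidelity of two qubit states with Stokes vectors r and s (closed form)."""
    r, s = np.asarray(r, dtype=float), np.asarray(s, dtype=float)
    purity_term = (1 - np.sum(r * r, axis=-1)) * (1 - np.sum(s * s, axis=-1))
    root = np.sqrt(np.clip(purity_term, 0.0, None))  # guard against rounding below zero
    return 0.5 * (1 + np.sum(r * s, axis=-1) + root)


def mean_infidelity_with_estimate(particles, weights):
    """E_posterior[1 - F(rho, rho_hat)], with rho_hat the posterior mean state."""
    estimate = weights @ particles
    score = weights @ (1.0 - qubit_fidelity(particles, estimate))
    return score, estimate


# Toy posterior: two pure states tilted by +/- theta from the z axis, equal weights.
theta = 0.2
particles = np.array([[np.sin(theta), 0.0, np.cos(theta)],
                      [-np.sin(theta), 0.0, np.cos(theta)]])
weights = np.array([0.5, 0.5])
score, estimate = mean_infidelity_with_estimate(particles, weights)
```

The score vanishes only when the posterior collapses onto a single state, so it measures the remaining spread of the posterior around its mean without any reference to the true state.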

In a real-world application of tomography the 'true' state is unknown, and the Bayesian
estimate given above is the only figure of merit at hand. However, in our experiments we
can estimate the prepared state independently. To do this we performed a very large
number of Stokes parameter measurements. Denoting the 'true' state estimated this way by
$\tilde\rho$, we may now analyze the protocol performance using the mean infidelity with
the 'true' state,
$1 - F(\rho,\tilde\rho) = \mathbb{E}_{p(\rho|\mathcal{D}_n)}[1 - F(\rho,\tilde\rho)]$. The
scaling of this quantity with $N$ is depicted in Fig. 6. Power law fits in this case give
$a = -0.502 \pm 0.001$ and $a = -0.902 \pm 0.008$ for the random and adaptive strategies,
respectively. Error bars here are from the fit of an average curve, and within errors the
experimentally obtained scaling laws agree with simulations.

Our model does not take into account systematic errors caused, for example, by
inaccuracies in waveplate rotation. However, for the reached values of infidelity, on the
order of $10^{-4}$ to $10^{-3}$, we did not observe any deviations from the expected
behavior and were not able to identify the influence of systematic errors. Further
investigation with much larger statistics is required to address this issue.
Figure 6: Experimental results: mean infidelity $1 - F(\rho,\tilde\rho)$ with the "true"
state $\tilde\rho$ for non-adaptive measurements (black line) and adaptive measurements
(red line). Data points are averages over 6 experimental runs. Solid straight lines are
best power law fits.

Conclusion. Our experimental results clearly demonstrate the advantages of adaptive
strategies in quantum state tomography. We have adapted Bayesian methods of state
estimation: because Bayesian methods maintain confidence levels and error bars along with
their estimates, they are a very natural tool for the task of adaptive experiment design.
Besides the aforementioned favorable properties, the Bayesian approach is convenient from
a purely practical point of view. It does not require any additional precomputation, and
since posterior updates may be easily carried out after a single detection event, we
expect that this approach will be particularly useful in the case of extremely weak
signals. The $N^{-1}$ scaling of infidelity in the adaptive case is the theoretical limit
for any tomographic protocol, and further improvement may only affect prefactors in this
power law. Simulation results show that our strategy of choosing adaptively between
general measurements outperforms any non-adaptive protocol, and although only results for
completely random measurements and fixed bases are provided here, experimental work
showing the worse performance of more sophisticated non-adaptive strategies is underway.
Finally, let us note that using an attenuated laser source is equivalent to using a true
single-photon source for the purposes of this particular single-qubit experiment.
Generalization of the developed adaptive protocol to two-qubit polarization states and
higher-dimensional systems (like spatial modes of the biphoton field) will be reported
elsewhere.

After this paper was completed we became aware of a highly relevant work [14] taking a
different approach to adaptive state estimation and achieving similar performance.